fix: validate tensor dims_count against the shared-memory region by LinZiyuu · Pull Request #438 · triton-inference-server/python_backend

LinZiyuu · 2026-05-30T22:14:17Z

This validates the tensor dims_count read from shared memory before it is used, extending the shared-memory boundary validation added in #405 / #406.

Previously, PbTensor::LoadFromSharedMemory() took dims_count from the shared-memory region and used it to compute name_offset and to build the dims vector (std::vector<int64_t>(dims_ptr, dims_ptr + dims_count)) with no bounds check. A corrupted dims_count makes the parent process perform a large out-of-bounds read and crash the server, and the sizeof(int64_t) * dims_count product can overflow into a small, controlled name_offset. The #406 fix bounded MemoryShm::byte_size, but not this dimension count.
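The overflow described above can be illustrated with a small sketch. It assumes `dims_count` is a 64-bit unsigned value; `NaiveDimsBytes` is a hypothetical name used only for illustration and is not a function in python_backend.

```cpp
#include <cstdint>

// Hypothetical helper, not part of python_backend: computes the byte length
// of the dims array the way an unchecked implementation might.
// Unsigned arithmetic wraps modulo 2^64, so a huge dims_count can produce
// a small result.
constexpr std::uint64_t NaiveDimsBytes(std::uint64_t dims_count) {
  return sizeof(std::int64_t) * dims_count;
}

// A sane count gives the expected size: 3 dims of 8 bytes each.
static_assert(NaiveDimsBytes(3) == 24, "3 * 8 bytes");

// A corrupted count of 2^61 + 1 gives 8 * (2^61 + 1) = 2^64 + 8,
// which wraps to 8: a small offset derived from a huge count.
static_assert(NaiveDimsBytes((std::uint64_t{1} << 61) + 1) == 8,
              "product wraps modulo 2^64");
```

The wrapped product would look like a plausible offset, while the vector construction would still iterate over the huge `dims_count`.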

This change validates dims_count so the dims array stays within the region before it is used, throwing PythonBackendException otherwise. The check uses division to avoid overflowing the product, and mirrors the MemoryShm::byte_size boundary check. Valid tensors are unaffected.
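A minimal sketch of a division-based check of this kind follows. It is not the PR's actual code: `DimsFitInRegion`, `CheckDimsCount`, `dims_offset`, and `region_size` are hypothetical names, and `std::out_of_range` stands in for `PythonBackendException`.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical check, not the PR's implementation.
//   dims_count  : element count read from shared memory (untrusted)
//   dims_offset : byte offset of the dims array within the region
//   region_size : total size of the shared-memory region in bytes
// Returns true only if dims_count int64_t elements fit in the region.
inline bool DimsFitInRegion(std::uint64_t dims_count,
                            std::uint64_t dims_offset,
                            std::uint64_t region_size) {
  if (dims_offset > region_size) {
    return false;  // the array would start past the end of the region
  }
  const std::uint64_t available = region_size - dims_offset;
  // Divide instead of multiplying: available / 8 cannot overflow,
  // whereas dims_count * 8 can wrap around.
  return dims_count <= available / sizeof(std::int64_t);
}

// Throwing wrapper; std::out_of_range is a stand-in for the backend's
// own exception type.
inline void CheckDimsCount(std::uint64_t dims_count,
                           std::uint64_t dims_offset,
                           std::uint64_t region_size) {
  if (!DimsFitInRegion(dims_count, dims_offset, region_size)) {
    throw std::out_of_range("dims_count exceeds shared-memory region");
  }
}
```

With a 64-byte region and the dims array at offset 16, at most six `int64_t` elements fit, so a count of 6 passes and a count of 7 (or a wrapped value such as 2^61 + 1) is rejected.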

Reproduced on nvcr.io/nvidia/tritonserver:26.04-py3 (CPU): a model that overwrites a live output tensor's dims_count in the backend shared memory crashes the whole server (Exited (139), SIGSEGV) within seconds, via python_be.cc → InferResponse::LoadFromSharedMemory → PbTensor::LoadFromSharedMemory. With this change the corrupted tensor is rejected with an error instead of faulting.

The sibling unbounded values read from shared memory have the same pattern and are worth a follow-up: InferResponse::outputs_size, InferRequest::requested_output_count/input_count, PbMap::length, MessageQueue::Pop's tail, and the object handles passed to SharedMemoryManager::Load<T>.

PbTensor::LoadFromSharedMemory() reads `dims_count` from shared memory and uses it to compute `name_offset` and to construct the dims vector (`std::vector<int64_t>(dims_ptr, dims_ptr + dims_count)`) without checking it against the region. A corrupted `dims_count` (e.g. written by a model into the backend shm) makes the parent process perform a large out-of-bounds read, crashing the server; the `sizeof(int64_t) * dims_count` product can also overflow and yield a small, controlled `name_offset`.

Validate `dims_count` so that the dims array stays within the shared-memory region before it is used, throwing PythonBackendException otherwise. The check uses division to avoid overflowing the product, and mirrors the MemoryShm::byte_size boundary check. Valid tensors are unaffected.

Signed-off-by: LinZiyuu <linziyu0205@163.com>

LinZiyuu wants to merge 1 commit into triton-inference-server:main from LinZiyuu:fix/validate-tensor-dims-count
